Overview


Dataset statistics

Number of variables: 15
Number of observations: 320696
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 36.7 MiB
Average record size in memory: 120.0 B
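
The two memory figures above are consistent with each other: the total is the average record size multiplied by the number of observations. A minimal check, assuming the report uses 1 MiB = 1024² bytes (the helper name `total_size_mib` is illustrative, not part of any library):

```python
# Cross-check "Total size in memory" against "Average record size in memory".
N_OBSERVATIONS = 320_696
AVG_RECORD_BYTES = 120.0
BYTES_PER_MIB = 1024 ** 2  # assumption: MiB means 2**20 bytes


def total_size_mib(n_rows: int, bytes_per_row: float) -> float:
    """Return the in-memory size in MiB for n_rows records."""
    return n_rows * bytes_per_row / BYTES_PER_MIB


if __name__ == "__main__":
    print(f"{total_size_mib(N_OBSERVATIONS, AVG_RECORD_BYTES):.1f} MiB")
```

Dividing the same total by the 15 variables gives roughly 2.4 MiB per column, which matches the per-variable "Memory size" entries.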

Variable types

Categorical: 4
Numeric: 8
Text: 2
DateTime: 1

Alerts

group_number has constant value "2" (Constant)
day has constant value "Wednesday" (Constant)
batch_number has constant value "4" (Constant)
bath is highly overall correlated with house_size (High correlation)
bed is highly overall correlated with house_size (High correlation)
house_size is highly overall correlated with bath and bed (High correlation)
bed is highly skewed (γ1 = 181.4931731) (Skewed)
bath is highly skewed (γ1 = 72.34021908) (Skewed)
acre_lot is highly skewed (γ1 = 113.5876888) (Skewed)
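
The skewness alerts are driven by a handful of extreme entries (for example, bed has two rows with the value 444). The sketch below uses the plain moment-based coefficient g1 = m3 / m2^1.5 on a hypothetical toy column; the report's own estimator may apply a bias correction, so treat this as an illustration of the effect rather than a reproduction of the reported γ1 values.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (population form, no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical toy data: typical bedroom counts, then the same data plus one outlier.
typical = [3] * 500 + [4] * 300 + [2] * 200
with_outlier = typical + [444]

if __name__ == "__main__":
    print(f"without outlier: {skewness(typical):.2f}")
    print(f"with outlier:    {skewness(with_outlier):.2f}")
```

A single implausible value among a thousand rows is enough to move the coefficient from near zero to a very large positive number, which is why the alert is better read as a prompt to inspect the maximum values than as a description of the bulk of the data.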

Reproduction

Analysis started: 2025-06-02 05:02:09.999533
Analysis finished: 2025-06-02 05:02:48.126013
Duration: 38.13 seconds
Software version: ydata-profiling v4.16.1
Download configuration: config.json
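
The reported duration can be recovered from the two timestamps. A small sketch using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2025-06-02 05:02:09.999533")
finished = datetime.fromisoformat("2025-06-02 05:02:48.126013")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

if __name__ == "__main__":
    print(f"{duration_seconds:.2f} seconds")
```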

Variables

group_number
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
Value counts: 2 (320696)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 320696
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 320696 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Most occurring characters

Value | Count | Frequency (%)
2 | 320696 | 100.0%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 320696 | 100.0%

day
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
Value counts: Wednesday (320696)

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 2886264
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Wednesday
2nd row: Wednesday
3rd row: Wednesday
4th row: Wednesday
5th row: Wednesday

Common Values

Value | Count | Frequency (%)
Wednesday | 320696 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Most occurring characters

Value | Count | Frequency (%)
e | 641392 | 22.2%
d | 641392 | 22.2%
W | 320696 | 11.1%
n | 320696 | 11.1%
s | 320696 | 11.1%
a | 320696 | 11.1%
y | 320696 | 11.1%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 2886264 | 100.0%

batch_number
Categorical

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
Value counts: 4 (320696)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 320696
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 4
3rd row: 4
4th row: 4
5th row: 4

Common Values

Value | Count | Frequency (%)
4 | 320696 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Most occurring characters

Value | Count | Frequency (%)
4 | 320696 | 100.0%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 320696 | 100.0%

brokered_by
Real number (ℝ)

Distinct: 45710
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54149.016
Minimum: 0
Maximum: 110140
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 8727
Q1: 26419
Median: 53083.5
Q3: 79183
95-th percentile: 105798
Maximum: 110140
Range: 110140
Interquartile range (IQR): 52764

Descriptive statistics

Standard deviation: 30154.63
Coefficient of variation (CV): 0.55688233
Kurtosis: -1.0705028
Mean: 54149.016
Median Absolute Deviation (MAD): 26137.5
Skewness: 0.075211177
Sum: 1.7365373 × 10^10
Variance: 9.0930172 × 10^8
Monotonicity: Not monotonic
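
Several entries in the quantile and descriptive tables are derived from others: range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, and variance = standard deviation squared. A sketch that recomputes them from the brokered_by figures above (the function name `derived_stats` is hypothetical):

```python
def derived_stats(minimum, q1, q3, maximum, mean, std):
    """Recompute the summary entries that follow from other entries."""
    return {
        "range": maximum - minimum,
        "iqr": q3 - q1,
        "cv": std / mean,        # coefficient of variation
        "variance": std ** 2,
    }


# Inputs copied from the brokered_by statistics in this report.
brokered_by = derived_stats(
    minimum=0, q1=26419, q3=79183, maximum=110140,
    mean=54149.016, std=30154.63,
)

if __name__ == "__main__":
    for name, value in brokered_by.items():
        print(f"{name}: {value:.6g}")
```

The same relations hold for the other numeric variables, so they are a quick way to spot transcription errors in a profile.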
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
22611 | 7468 | 2.3%
53016 | 3986 | 1.2%
16829 | 1960 | 0.6%
70677 | 1906 | 0.6%
71243 | 1847 | 0.6%
30807 | 1743 | 0.5%
84534 | 1501 | 0.5%
70650 | 1361 | 0.4%
33714 | 1291 | 0.4%
79221 | 1252 | 0.4%
Other values (45700) | 296381 | 92.4%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2 | < 0.1%
2 | 1 | < 0.1%
3 | 2 | < 0.1%
4 | 2 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
8 | 73 | < 0.1%
10 | 1 | < 0.1%
12 | 3 | < 0.1%
14 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
110140 | 1 | < 0.1%
110138 | 11 | < 0.1%
110135 | 2 | < 0.1%
110133 | 5 | < 0.1%
110132 | 2 | < 0.1%
110131 | 3 | < 0.1%
110127 | 1 | < 0.1%
110126 | 14 | < 0.1%
110123 | 2 | < 0.1%
110122 | 51 | < 0.1%

status
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
Value counts: sold (181057), for_sale (139639)

Length

Max length: 8
Median length: 4
Mean length: 5.7416993
Min length: 4

Characters and Unicode

Total characters: 1841340
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: for_sale
2nd row: for_sale
3rd row: for_sale
4th row: for_sale
5th row: for_sale

Common Values

Value | Count | Frequency (%)
sold | 181057 | 56.5%
for_sale | 139639 | 43.5%
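
The length and character figures for status follow directly from the two value counts. A minimal sketch that rebuilds them with the standard library (variable names are illustrative):

```python
from collections import Counter

# Value counts copied from the status variable in this report.
counts = {"sold": 181_057, "for_sale": 139_639}

total_rows = sum(counts.values())
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / total_rows

# Character frequencies: each character of a value occurs once per row holding it.
char_counts = Counter()
for value, n in counts.items():
    for ch in value:
        char_counts[ch] += n

if __name__ == "__main__":
    print(total_rows, total_chars, round(mean_length, 7))
    print(char_counts.most_common(3))
```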

Length

[Figure: Histogram of lengths of the category]

Most occurring characters

Value | Count | Frequency (%)
s | 320696 | 17.4%
o | 320696 | 17.4%
l | 320696 | 17.4%
d | 181057 | 9.8%
f | 139639 | 7.6%
r | 139639 | 7.6%
_ | 139639 | 7.6%
a | 139639 | 7.6%
e | 139639 | 7.6%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 1841340 | 100.0%

price
Real number (ℝ)

Distinct: 12236
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 397339.42
Minimum: 300023
Maximum: 500000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 300023
5-th percentile: 315000
Q1: 349900
Median: 395000
Q3: 445000
95-th percentile: 497400
Maximum: 500000
Range: 199977
Interquartile range (IQR): 95100

Descriptive statistics

Standard deviation: 56966.595
Coefficient of variation (CV): 0.14337011
Kurtosis: -1.1061248
Mean: 397339.42
Median Absolute Deviation (MAD): 46000
Skewness: 0.20603074
Sum: 1.2742516 × 10^11
Variance: 3.2451929 × 10^9
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
350000 | 9010 | 2.8%
325000 | 8474 | 2.6%
425000 | 8204 | 2.6%
450000 | 8014 | 2.5%
375000 | 7552 | 2.4%
399900 | 6502 | 2.0%
475000 | 6263 | 2.0%
349900 | 5981 | 1.9%
399000 | 5394 | 1.7%
315000 | 5166 | 1.6%
Other values (12226) | 250136 | 78.0%

Minimum 10 values

Value | Count | Frequency (%)
300023 | 1 | < 0.1%
300040 | 1 | < 0.1%
300050 | 1 | < 0.1%
300099 | 1 | < 0.1%
300100 | 1 | < 0.1%
300120 | 1 | < 0.1%
300135 | 1 | < 0.1%
300155 | 1 | < 0.1%
300190 | 1 | < 0.1%
300199 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
500000 | 3730 | 1.2%
499999 | 810 | 0.3%
499998 | 15 | < 0.1%
499997 | 8 | < 0.1%
499995 | 54 | < 0.1%
499993 | 1 | < 0.1%
499990 | 221 | 0.1%
499989 | 3 | < 0.1%
499988 | 2 | < 0.1%
499987 | 2 | < 0.1%
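
A frequency table with an "Other values (n)" row, like the one for the most common prices, can be built by counting values, keeping the top entries, and folding the remainder into one row. The sketch below is one way to do that with the standard library; the function name `common_values` and the toy price list are hypothetical and not taken from the profiler's implementation.

```python
from collections import Counter


def common_values(values, top=10):
    """Return (value, count, percent) rows for the top values plus an 'Other values' row."""
    counts = Counter(values)
    total = len(values)
    rows = [(v, c, 100.0 * c / total) for v, c in counts.most_common(top)]
    shown = sum(c for _, c, _ in rows)
    n_other = len(counts) - len(rows)  # distinct values not listed individually
    if n_other > 0:
        rest = total - shown
        rows.append((f"Other values ({n_other})", rest, 100.0 * rest / total))
    return rows


# Hypothetical toy data, not rows from the dataset.
prices = [350000] * 3 + [325000] * 2 + [399900, 475000]
table = common_values(prices, top=2)

if __name__ == "__main__":
    for value, count, pct in table:
        print(f"{value} | {count} | {pct:.1f}%")
```

Applied to the report's own numbers, the ten listed price counts sum to 70560, which leaves 320696 − 70560 = 250136 rows spread over the remaining 12226 distinct values.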

bed
Real number (ℝ)

High correlation  Skewed 

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.3523649
Minimum: 1
Maximum: 444
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 444
Range: 443
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.4339532
Coefficient of variation (CV): 0.42774378
Kurtosis: 55616.175
Mean: 3.3523649
Median Absolute Deviation (MAD): 1
Skewness: 181.49317
Sum: 1075090
Variance: 2.0562219
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=26)]

Common values

Value | Count | Frequency (%)
3 | 155576 | 48.5%
4 | 100928 | 31.5%
2 | 37041 | 11.6%
5 | 18878 | 5.9%
1 | 3620 | 1.1%
6 | 3098 | 1.0%
7 | 572 | 0.2%
8 | 496 | 0.2%
9 | 256 | 0.1%
10 | 87 | < 0.1%
Other values (16) | 144 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 3620 | 1.1%
2 | 37041 | 11.6%
3 | 155576 | 48.5%
4 | 100928 | 31.5%
5 | 18878 | 5.9%
6 | 3098 | 1.0%
7 | 572 | 0.2%
8 | 496 | 0.2%
9 | 256 | 0.1%
10 | 87 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
444 | 2 | < 0.1%
41 | 1 | < 0.1%
40 | 1 | < 0.1%
38 | 1 | < 0.1%
33 | 1 | < 0.1%
22 | 2 | < 0.1%
21 | 2 | < 0.1%
20 | 2 | < 0.1%
18 | 7 | < 0.1%
17 | 2 | < 0.1%
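
The largest bed values sit far outside the interquartile range. One common convention for flagging such entries is Tukey's fences (Q1 − k·IQR, Q3 + k·IQR with k = 1.5). The report itself does not apply this rule; the sketch below is an assumption-laden illustration using the quartiles reported for bed.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (low, high) fences; k=1.5 is the conventional multiplier."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Q1 and Q3 copied from the bed quantile statistics.
low, high = tukey_fences(3, 4)

# The ten largest bed values listed in this report.
largest = [444, 41, 40, 38, 33, 22, 21, 20, 18, 17]
flagged = [v for v in largest if v > high]

if __name__ == "__main__":
    print(f"fences: ({low}, {high})")
    print(f"flagged: {flagged}")
```

Because the IQR is only 1, the fences are (1.5, 5.5), so the rule would also flag ordinary 1-bedroom and 6-bedroom listings. For small integer counts the fences are best treated as a starting point for inspection rather than a definition of an error.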

bath
Real number (ℝ)

High correlation  Skewed 

Distinct: 22
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.4906609
Minimum: 1
Maximum: 222
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 2
Q3: 3
95-th percentile: 4
Maximum: 222
Range: 221
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.0113037
Coefficient of variation (CV): 0.40603827
Kurtosis: 14733.262
Mean: 2.4906609
Median Absolute Deviation (MAD): 1
Skewness: 72.340219
Sum: 798745
Variance: 1.0227351
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=22)]

Common values

Value | Count | Frequency (%)
2 | 152966 | 47.7%
3 | 114246 | 35.6%
4 | 27883 | 8.7%
1 | 22782 | 7.1%
5 | 2225 | 0.7%
6 | 417 | 0.1%
7 | 75 | < 0.1%
8 | 62 | < 0.1%
9 | 12 | < 0.1%
10 | 5 | < 0.1%
Other values (12) | 23 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 22782 | 7.1%
2 | 152966 | 47.7%
3 | 114246 | 35.6%
4 | 27883 | 8.7%
5 | 2225 | 0.7%
6 | 417 | 0.1%
7 | 75 | < 0.1%
8 | 62 | < 0.1%
9 | 12 | < 0.1%
10 | 5 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
222 | 2 | < 0.1%
113 | 2 | < 0.1%
25 | 1 | < 0.1%
21 | 1 | < 0.1%
20 | 1 | < 0.1%
19 | 1 | < 0.1%
16 | 3 | < 0.1%
15 | 2 | < 0.1%
14 | 1 | < 0.1%
13 | 3 | < 0.1%

acre_lot
Real number (ℝ)

Skewed 

Distinct: 2854
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.985941
Minimum: 0
Maximum: 100000
Zeros: 524
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0.04
Q1: 0.14
Median: 0.21
Q3: 0.37
95-th percentile: 2.9
Maximum: 100000
Range: 100000
Interquartile range (IQR): 0.23

Descriptive statistics

Standard deviation: 744.3326
Coefficient of variation (CV): 74.538054
Kurtosis: 14063.843
Mean: 9.985941
Median Absolute Deviation (MAD): 0.09
Skewness: 113.58769
Sum: 3202451.3
Variance: 554031.03
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0.14 | 14934 | 4.7%
0.17 | 14784 | 4.6%
0.16 | 13106 | 4.1%
0.15 | 12515 | 3.9%
0.18 | 11785 | 3.7%
0.23 | 11612 | 3.6%
0.19 | 10722 | 3.3%
0.13 | 10259 | 3.2%
0.2 | 9276 | 2.9%
0.11 | 8823 | 2.8%
Other values (2844) | 202880 | 63.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 524 | 0.2%
0.01 | 1492 | 0.5%
0.02 | 4029 | 1.3%
0.03 | 5628 | 1.8%
0.04 | 5841 | 1.8%
0.05 | 5070 | 1.6%
0.06 | 4461 | 1.4%
0.07 | 4196 | 1.3%
0.08 | 3728 | 1.2%
0.09 | 4880 | 1.5%

Maximum 10 values

Value | Count | Frequency (%)
100000 | 11 | < 0.1%
90124 | 1 | < 0.1%
86139 | 1 | < 0.1%
85442 | 1 | < 0.1%
80135 | 1 | < 0.1%
65145 | 1 | < 0.1%
50120 | 1 | < 0.1%
49312 | 2 | < 0.1%
48136 | 1 | < 0.1%
43560 | 2 | < 0.1%

street
Real number (ℝ)

Distinct: 297041
Distinct (%): 92.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 931214.71
Minimum: 614
Maximum: 2000878
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 614
5-th percentile: 104493.25
Q1: 466395.25
Median: 932876
Q3: 1396147.5
95-th percentile: 1751390.8
Maximum: 2000878
Range: 2000264
Interquartile range (IQR): 929752.25

Descriptive statistics

Standard deviation: 531165.7
Coefficient of variation (CV): 0.57040089
Kurtosis: -1.2112578
Mean: 931214.71
Median Absolute Deviation (MAD): 464907.5
Skewness: -0.0031399269
Sum: 2.9863683 × 10^11
Variance: 2.82137 × 10^11
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1070683 | 4 | < 0.1%
1372701 | 4 | < 0.1%
1708619 | 3 | < 0.1%
1239644 | 3 | < 0.1%
1418212 | 3 | < 0.1%
1564648 | 3 | < 0.1%
1074475 | 3 | < 0.1%
113008 | 3 | < 0.1%
914969 | 3 | < 0.1%
201177 | 3 | < 0.1%
Other values (297031) | 320664 | > 99.9%

Minimum 10 values

Value | Count | Frequency (%)
614 | 1 | < 0.1%
1796 | 1 | < 0.1%
2905 | 1 | < 0.1%
3161 | 1 | < 0.1%
3913 | 1 | < 0.1%
3920 | 1 | < 0.1%
4276 | 1 | < 0.1%
4279 | 1 | < 0.1%
4321 | 1 | < 0.1%
4331 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2000878 | 1 | < 0.1%
2000549 | 1 | < 0.1%
1998390 | 1 | < 0.1%
1997958 | 1 | < 0.1%
1996636 | 1 | < 0.1%
1995345 | 1 | < 0.1%
1994989 | 1 | < 0.1%
1994678 | 1 | < 0.1%
1994246 | 1 | < 0.1%
1994239 | 1 | < 0.1%

city
Text

Distinct: 9936
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Length

Max length: 30
Median length: 25
Mean length: 8.9727249
Min length: 2

Characters and Unicode

Total characters: 2877517
Distinct characters: 54
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 2352
Unique (%): 0.7%

Sample

1st row: Agawam
2nd row: Pelham
3rd row: Amherst
4th row: Amherst
5th row: Amherst

Words

Value | Count | Frequency (%)
city | 7832 | 1.9%
saint | 5079 | 1.2%
houston | 4860 | 1.2%
fort | 4458 | 1.1%
valley | 4451 | 1.1%
lake | 3704 | 0.9%
beach | 3694 | 0.9%
phoenix | 3628 | 0.9%
san | 3593 | 0.9%
tucson | 3035 | 0.7%
Other values (7940) | 368981 | 89.3%

Most occurring characters

Value | Count | Frequency (%)
e | 271817 | 9.4%
a | 250615 | 8.7%
o | 220433 | 7.7%
n | 209213 | 7.3%
l | 193351 | 6.7%
r | 186272 | 6.5%
i | 171090 | 5.9%
t | 154101 | 5.4%
s | 119041 | 4.1%
(space) | 92619 | 3.2%
Other values (44) | 1008965 | 35.1%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 2877517 | 100.0%

state
Text

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Length

Max length: 20
Median length: 13
Mean length: 8.1586019
Min length: 4

Characters and Unicode

Total characters: 2616431
Distinct characters: 46
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Massachusetts
2nd row: Massachusetts
3rd row: Massachusetts
4th row: Massachusetts
5th row: Massachusetts

Words

Value | Count | Frequency (%)
florida | 39633 | 11.2%
texas | 37637 | 10.6%
california | 32118 | 9.1%
arizona | 22715 | 6.4%
georgia | 15665 | 4.4%
new | 15118 | 4.3%
washington | 13881 | 3.9%
carolina | 12872 | 3.6%
virginia | 12625 | 3.6%
maryland | 12235 | 3.5%
Other values (49) | 139068 | 39.3%

Most occurring characters

Value | Count | Frequency (%)
a | 365017 | 14.0%
i | 288673 | 11.0%
o | 233625 | 8.9%
n | 225209 | 8.6%
r | 191314 | 7.3%
s | 158783 | 6.1%
e | 148561 | 5.7%
l | 144513 | 5.5%
d | 69782 | 2.7%
t | 64085 | 2.4%
Other values (36) | 726869 | 27.8%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 2616431 | 100.0%

zip_code
Real number (ℝ)

Distinct: 16479
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55301.378
Minimum: 612
Maximum: 99901
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 612
5-th percentile: 8088
Q1: 30101
Median: 55406
Q3: 85034
95-th percentile: 97424
Maximum: 99901
Range: 99289
Interquartile range (IQR): 54933

Descriptive statistics

Standard deviation: 30004.256
Coefficient of variation (CV): 0.54255893
Kurtosis: -1.420597
Mean: 55301.378
Median Absolute Deviation (MAD): 26072
Skewness: -0.059961202
Sum: 1.7734931 × 10^10
Variance: 9.0025541 × 10^8
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
85138 | 956 | 0.3%
85326 | 545 | 0.2%
85122 | 521 | 0.2%
77494 | 486 | 0.2%
77007 | 456 | 0.1%
34746 | 430 | 0.1%
92345 | 413 | 0.1%
76179 | 408 | 0.1%
85375 | 403 | 0.1%
85143 | 399 | 0.1%
Other values (16469) | 315679 | 98.4%

Minimum 10 values

Value | Count | Frequency (%)
612 | 1 | < 0.1%
677 | 1 | < 0.1%
718 | 1 | < 0.1%
725 | 1 | < 0.1%
727 | 1 | < 0.1%
736 | 1 | < 0.1%
738 | 1 | < 0.1%
745 | 1 | < 0.1%
765 | 2 | < 0.1%
778 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
99901 | 7 | < 0.1%
99827 | 1 | < 0.1%
99824 | 1 | < 0.1%
99801 | 19 | < 0.1%
99712 | 5 | < 0.1%
99709 | 10 | < 0.1%
99705 | 14 | < 0.1%
99701 | 4 | < 0.1%
99672 | 1 | < 0.1%
99669 | 5 | < 0.1%

house_size
Real number (ℝ)

High correlation 

Distinct: 5080
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1981.6458
Minimum: 100
Maximum: 54892
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 100
5-th percentile: 1021
Q1: 1484
Median: 1872
Q3: 2356
95-th percentile: 3255
Maximum: 54892
Range: 54792
Interquartile range (IQR): 872

Descriptive statistics

Standard deviation: 796.87584
Coefficient of variation (CV): 0.40212828
Kurtosis: 221.36423
Mean: 1981.6458
Median Absolute Deviation (MAD): 428
Skewness: 6.9225109
Sum: 6.3550589 × 10^8
Variance: 635011.1
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1800 | 1308 | 0.4%
2000 | 1053 | 0.3%
1600 | 1051 | 0.3%
1500 | 985 | 0.3%
1200 | 979 | 0.3%
2200 | 907 | 0.3%
1440 | 881 | 0.3%
1700 | 826 | 0.3%
2100 | 822 | 0.3%
1400 | 820 | 0.3%
Other values (5070) | 311064 | 97.0%

Minimum 10 values

Value | Count | Frequency (%)
100 | 2 | < 0.1%
121 | 2 | < 0.1%
150 | 1 | < 0.1%
170 | 1 | < 0.1%
200 | 2 | < 0.1%
210 | 1 | < 0.1%
228 | 1 | < 0.1%
251 | 1 | < 0.1%
255 | 1 | < 0.1%
266 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
54892 | 1 | < 0.1%
40720 | 1 | < 0.1%
39996 | 1 | < 0.1%
36808 | 1 | < 0.1%
34690 | 1 | < 0.1%
29997 | 9 | < 0.1%
28750 | 1 | < 0.1%
24400 | 1 | < 0.1%
21519 | 2 | < 0.1%
21500 | 1 | < 0.1%

prev_sold_date
DateTime

Distinct: 10772
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
Minimum: 1901-01-01 00:00:00
Maximum: 2024-04-17 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%

[Figure: Histogram with fixed size bins (bins=50)]

Interactions

[Figures: 64 pairwise interaction plots, one for each ordered pair of the 8 numeric variables]

Correlations

[Figure: correlation heatmap]

Variable | acre_lot | bath | bed | brokered_by | house_size | price | status | street | zip_code
acre_lot | 1.000 | 0.042 | 0.177 | -0.026 | 0.261 | 0.022 | 0.000 | -0.014 | -0.171
bath | 0.042 | 1.000 | 0.486 | -0.023 | 0.651 | 0.142 | 0.000 | -0.015 | -0.206
bed | 0.177 | 0.486 | 1.000 | -0.016 | 0.621 | 0.093 | 0.000 | -0.005 | -0.154
brokered_by | -0.026 | -0.023 | -0.016 | 1.000 | -0.033 | 0.014 | 0.025 | 0.003 | 0.079
house_size | 0.261 | 0.651 | 0.621 | -0.033 | 1.000 | 0.153 | 0.002 | -0.021 | -0.204
price | 0.022 | 0.142 | 0.093 | 0.014 | 0.153 | 1.000 | 0.014 | -0.008 | 0.108
status | 0.000 | 0.000 | 0.000 | 0.025 | 0.002 | 0.014 | 1.000 | 0.016 | 0.152
street | -0.014 | -0.015 | -0.005 | 0.003 | -0.021 | -0.008 | 0.016 | 1.000 | 0.009
zip_code | -0.171 | -0.206 | -0.154 | 0.079 | -0.204 | 0.108 | 0.152 | 0.009 | 1.000
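
The "High correlation" alerts can be read off this matrix by scanning the upper triangle for coefficients whose absolute value exceeds a cutoff. The sketch below assumes a cutoff of 0.5, which is consistent with the alerts in this report (bath and bed at 0.486 are not flagged), but the threshold actually configured for this run is not stated here. The function name `high_pairs` is hypothetical.

```python
NAMES = ["acre_lot", "bath", "bed", "brokered_by", "house_size",
         "price", "status", "street", "zip_code"]

# Correlation matrix copied from the report, rows and columns in NAMES order.
MATRIX = [
    [1.000, 0.042, 0.177, -0.026, 0.261, 0.022, 0.000, -0.014, -0.171],
    [0.042, 1.000, 0.486, -0.023, 0.651, 0.142, 0.000, -0.015, -0.206],
    [0.177, 0.486, 1.000, -0.016, 0.621, 0.093, 0.000, -0.005, -0.154],
    [-0.026, -0.023, -0.016, 1.000, -0.033, 0.014, 0.025, 0.003, 0.079],
    [0.261, 0.651, 0.621, -0.033, 1.000, 0.153, 0.002, -0.021, -0.204],
    [0.022, 0.142, 0.093, 0.014, 0.153, 1.000, 0.014, -0.008, 0.108],
    [0.000, 0.000, 0.000, 0.025, 0.002, 0.014, 1.000, 0.016, 0.152],
    [-0.014, -0.015, -0.005, 0.003, -0.021, -0.008, 0.016, 1.000, 0.009],
    [-0.171, -0.206, -0.154, 0.079, -0.204, 0.108, 0.152, 0.009, 1.000],
]


def high_pairs(names, matrix, threshold=0.5):
    """List (name_a, name_b, r) for off-diagonal pairs with |r| >= threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


pairs = high_pairs(NAMES, MATRIX)

if __name__ == "__main__":
    for a, b, r in pairs:
        print(f"{a} ~ {b}: {r:.3f}")
```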

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
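
The per-column nullity behind these figures is a count of missing entries in each column. A minimal standard-library sketch, using hypothetical records in which one value has been blanked for illustration (the profiled dataset itself reports zero missing cells):

```python
def missing_per_column(rows, columns):
    """Count entries that are None (or absent) in each column."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


# Hypothetical records; the None is introduced only to show a non-zero count.
rows = [
    {"price": 384900.0, "bed": 3.0, "city": "Agawam"},
    {"price": 419000.0, "bed": None, "city": "Pelham"},
]
nullity = missing_per_column(rows, ["price", "bed", "city"])

if __name__ == "__main__":
    print(nullity)
```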

Sample

First rows

index | group_number | day | batch_number | brokered_by | status | price | bed | bath | acre_lot | street | city | state | zip_code | house_size | prev_sold_date
0 | 2 | Wednesday | 4 | 97400.0 | for_sale | 384900.0 | 3.0 | 2.0 | 0.46 | 1244899.0 | Agawam | Massachusetts | 1001.0 | 1476.0 | 1986-11-20
1 | 2 | Wednesday | 4 | 22188.0 | for_sale | 419000.0 | 4.0 | 2.0 | 2.00 | 1417448.0 | Pelham | Massachusetts | 1002.0 | 1607.0 | 2005-07-25
2 | 2 | Wednesday | 4 | 48733.0 | for_sale | 415000.0 | 4.0 | 2.0 | 0.49 | 816546.0 | Amherst | Massachusetts | 1002.0 | 1814.0 | 1997-06-30
3 | 2 | Wednesday | 4 | 48733.0 | for_sale | 390000.0 | 3.0 | 2.0 | 0.50 | 236564.0 | Amherst | Massachusetts | 1002.0 | 1128.0 | 1998-09-17
4 | 2 | Wednesday | 4 | 48733.0 | for_sale | 499900.0 | 4.0 | 3.0 | 0.46 | 197884.0 | Amherst | Massachusetts | 1002.0 | 2143.0 | 1988-06-10
5 | 2 | Wednesday | 4 | 27510.0 | for_sale | 439900.0 | 3.0 | 2.0 | 0.20 | 1032285.0 | Amherst | Massachusetts | 1002.0 | 1349.0 | 2005-09-02
6 | 2 | Wednesday | 4 | 110138.0 | for_sale | 397000.0 | 2.0 | 2.0 | 0.51 | 1212530.0 | Granby | Massachusetts | 1033.0 | 2000.0 | 2001-12-13
7 | 2 | Wednesday | 4 | 101392.0 | for_sale | 389900.0 | 2.0 | 2.0 | 0.92 | 366219.0 | Whately | Massachusetts | 1093.0 | 1723.0 | 1997-09-25
8 | 2 | Wednesday | 4 | 736.0 | for_sale | 459900.0 | 4.0 | 3.0 | 3.80 | 152461.0 | Sunderland | Massachusetts | 1375.0 | 2768.0 | 2009-07-17
9 | 2 | Wednesday | 4 | 8147.0 | for_sale | 324900.0 | 4.0 | 2.0 | 0.56 | 985872.0 | Granby | Massachusetts | 1033.0 | 1523.0 | 2007-09-19

Last rows

index | group_number | day | batch_number | brokered_by | status | price | bed | bath | acre_lot | street | city | state | zip_code | house_size | prev_sold_date
320686 | 2 | Wednesday | 4 | 33745.0 | sold | 310000.0 | 3.0 | 1.0 | 0.23 | 1594744.0 | Richland | Washington | 99354.0 | 1845.0 | 2022-02-14
320687 | 2 | Wednesday | 4 | 108210.0 | sold | 425000.0 | 4.0 | 3.0 | 0.25 | 215391.0 | Richland | Washington | 99354.0 | 2732.0 | 2022-02-14
320688 | 2 | Wednesday | 4 | 108243.0 | sold | 425000.0 | 3.0 | 3.0 | 0.06 | 970797.0 | Richland | Washington | 99354.0 | 1876.0 | 2022-02-14
320689 | 2 | Wednesday | 4 | 16235.0 | sold | 305000.0 | 4.0 | 2.0 | 0.42 | 353937.0 | Richland | Washington | 99354.0 | 2000.0 | 2022-02-11
320690 | 2 | Wednesday | 4 | 53860.0 | sold | 310000.0 | 3.0 | 1.0 | 0.21 | 500240.0 | Richland | Washington | 99354.0 | 1152.0 | 2022-02-11
320691 | 2 | Wednesday | 4 | 60631.0 | sold | 385000.0 | 4.0 | 2.0 | 0.21 | 210890.0 | Richland | Washington | 99354.0 | 1656.0 | 2022-03-28
320692 | 2 | Wednesday | 4 | 85499.0 | sold | 339900.0 | 4.0 | 2.0 | 0.20 | 41160.0 | Richland | Washington | 99354.0 | 2780.0 | 2022-03-28
320693 | 2 | Wednesday | 4 | 23009.0 | sold | 359900.0 | 4.0 | 2.0 | 0.33 | 353094.0 | Richland | Washington | 99354.0 | 3600.0 | 2022-03-25
320694 | 2 | Wednesday | 4 | 18208.0 | sold | 350000.0 | 3.0 | 2.0 | 0.10 | 1062149.0 | Richland | Washington | 99354.0 | 1616.0 | 2022-03-25
320695 | 2 | Wednesday | 4 | 76856.0 | sold | 440000.0 | 6.0 | 3.0 | 0.50 | 405677.0 | Richland | Washington | 99354.0 | 3200.0 | 2022-03-24
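
Because zip_code is stored as a real number, leading zeros are lost: the Massachusetts rows show 1001.0 and 1002.0 rather than 01001 and 01002. A sketch of restoring the conventional five-digit form, assuming every value is a whole-number US ZIP code (the function name `format_zip` is hypothetical):

```python
def format_zip(value: float) -> str:
    """Render a numeric ZIP code as a zero-padded five-digit string."""
    return f"{int(value):05d}"


if __name__ == "__main__":
    for raw in (1001.0, 99354.0, 612.0):
        print(raw, "->", format_zip(raw))
```

Treating the column as text after this conversion would also keep it out of numeric summaries such as mean, skewness, and correlation, which carry little meaning for an identifier.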